Rank-Based Outlier Detection

Authors

  • H. Huang
  • Kishan Mehrotra
  • Chilukuri K. Mohan
Abstract

We propose a new approach to outlier detection based on a new ranking measure that asks whether a point is “important” to its nearest neighbors; in our notation, a low cumulative rank implies that the point is central. For instance, a point centrally located in a cluster has a relatively low cumulative sum of ranks because it is among the nearest neighbors of its own nearest neighbors. A point at the periphery of a cluster, by contrast, has a high cumulative sum of ranks because its nearest neighbors are closer to other points than to it. Using ranks avoids computing density in the neighborhood of the point, which improves performance. Our method outperforms several density-based methods on some synthetic data sets as well as on some real data sets.
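
The cumulative-rank idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the neighborhood size `k`, the Euclidean distance, the tie handling, and the toy data are all assumptions made for illustration, and the published method may normalize or threshold the score differently. For each point p, the sketch sums the rank that p holds in the distance ordering of each of p's own k nearest neighbors.

```python
import math


def rank_of(p, q, points):
    """Rank of points[p] as seen from points[q] (1 = closest to q).

    Assumption: tied distances share the better (smaller) rank.
    """
    d_qp = math.dist(points[q], points[p])
    closer = sum(
        1
        for j in range(len(points))
        if j != q and j != p and math.dist(points[q], points[j]) < d_qp
    )
    return closer + 1


def k_nearest(i, points, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def cumulative_rank_scores(points, k):
    """For each point p, sum the ranks p receives from its k nearest neighbors.

    A low sum suggests a central point; a high sum suggests an outlier.
    """
    scores = []
    for p in range(len(points)):
        neighbors = k_nearest(p, points, k)
        scores.append(sum(rank_of(p, q, points) for q in neighbors))
    return scores


# Hypothetical toy data: a unit square, its center, and one far-away point.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (5.0, 5.0)]
scores = cumulative_rank_scores(points, k=3)
print(scores)
```

In this toy data the center of the square receives the lowest score and the isolated point at (5, 5) the highest, which matches the intuition stated in the abstract. The brute-force distance computations are quadratic per query and are meant only to keep the sketch readable.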

Similar resources

Outlier Detection Based on the Distribution of Distances between Data Points

A novel approach to outlier detection, based on the properties of the distribution of distances between multidimensional points, is presented. The basic idea is to evaluate an outlier factor for each data point. The factor is used to rank the objects in the dataset by their degree of being an outlier. Outliers can then be identified by selecting the points with the minimal factor values. The main ad...

Mining Outlier Participants: Insights Using Directional Distributions in Latent Models

In this paper we will propose a new probabilistic topic model to score the expertise of participants on the projects that they contribute to based on their previous experience. Based on each participant’s score, we rank participants and define those who have the lowest scores as outlier participants. Since the focus of our study is on outliers, we name the model as Mining Outlier Participants f...

Improved Hybrid Clustering and Distance-based Technique for Outlier Removal

Outlier detection is the task of finding objects that are dissimilar to or inconsistent with the remaining data. It has many uses in applications such as fraud detection, network intrusion detection, and clinical diagnosis of diseases. Clustering algorithms are frequently used for outlier detection. The clustering algorithms consider outlier detection only to the poi...

Cell-DROS: A Fast Outlier Detection Method for Big Datasets

Outlier detection is a bottleneck in the analysis of big datasets because it is time-consuming. This paper proposes a fast outlier detection method for big datasets, which is a combination of cell-based algorithms and a ranking-based algorithm with various depths. A cell-based algorithm is proposed to transform a very large dataset to a fairly small set of weighted cells based on pred...

ODD 2 Workshop on Outlier Detection & Description under Data Diversity

In this talk I will briefly discuss recent advances in outlier detection, with a focus on distance-based techniques, and discuss possible future directions in the context of rank-driven interactive analysis and data-guided explanations and visualizations. Time permitting, we will examine such techniques in the context of real-world analysis of multi-modal data including time series, graphs, text ...

Publication date: 2015